Privacy-Preserving Schema Matching Using Mutual Information
Authors
Abstract
The problem of schema or ontology matching is to define mappings among schema or ontology elements. Such mappings are typically defined between two schemas or two ontologies at a time. Ideally, using the defined mappings, one would be able to issue a single query that is rewritten automatically for all the databases, instead of manually writing a query for each database. In a centrally mediated architecture a query is written in terms of a global schema or ontology that integrates all the database schemas or ontologies, while in a peer-to-peer architecture a query is written in terms of the schema or ontology of any of the peer databases. Automatic schema matching approaches can use only the schema, only the instances, or a combination of both. Mappings can take into account not only concept properties (e.g., string similarity), but also constraints (e.g., relationship cardinality) and schema structure (e.g., graph similarity) [9]. Security and privacy issues arise in the context of data integration. For example, previous work looks into secure access to mediated data [2, 4]. Other work has defined the concept of minimal necessary information sharing that applies to querying: in computing the answer to a query, only the query result should be revealed [1]. Most matching approaches rely on both schemas or ontologies being completely visible to both parties. Clearly, this approach disregards security and privacy considerations. Even within the same organization, different users have access to different database views. It is, therefore, only natural to create automatic mechanisms by which mappings can be established between a pair of schemas or ontologies, without each party needing to reveal their whole metadata. Clifton et al. discuss issues and identify research directions in privacy-preserving data integration, including those that arise in schema matching [3]. More recently, Mitra et al. look at the specific issue of privacy-preserving ontology matching [7, 8]. In their approach, terms in the ontologies and in the matching rules (which define the mappings) are encrypted, so that the mediator does not see the actual terms. However, during the ontology matching process, which is semi-automatic, a human expert has access to both ontologies in cleartext (using a session key). We propose an automatic privacy-preserving schema matching protocol. The result of this protocol is the set of mappings between attributes in the schemas of the two inter-
Related resources
A Survey of Privacy on Data Integration
This survey is an integrated view of other surveys on privacy preserving for data integration. First, we review the database context and challenges and research questions. Second, we formulate the privacy problems for schema matching and data matching. Next, we introduce the elements of privacy models. Then, we summarize the existing privacy techniques and the analysis (proofs) of privacy guara...
A centralized privacy-preserving framework for online social networks
There are some critical privacy concerns in current online social networks (OSNs). Users' information is disclosed to entities that were not supposed to access it. Furthermore, the notion of friendship is inadequate in OSNs since the degree of social relationships between users changes dynamically over time. Additionally, users may define similar privacy settings for their f...
Privacy-preserving Ontology Matching
Increasingly, there is a recognized need for secure information sharing. In order to implement information sharing between diverse organizations, we need privacy-preserving interoperation systems. In this work, we describe two frameworks for privacy-preserving interoperation systems. Ontology matching is an indispensable component of interoperation systems. To implement privacy-preserving intero...
Privacy-preserving Statistical Query and Processing on Distributed OpenEHR Data
Reuse of data from EHRs is essential for many purposes. The objective of the study was to explore how distributed electronic health record (EHR) data can be reused for privacy-preserving statistical query and processing. Method: We have designed and created a proof-of-concept prototype solution based on the OpenEHR specification to ensure interoperability and to query the EHRs. XMPP...
Privacy Preserving Record Linkage via grams Projections
Record linkage has been extensively used in various data mining applications involving sharing data. While the amount of available data is growing, the concern of disclosing sensitive information poses the problem of utility vs privacy. In this paper, we study the problem of private record linkage via secure data transformations. In contrast to the existing techniques in this area, we propose a...
Publication date: 2007